Lysosomes are cytoplasmic organelles present in almost all eukaryotic cells, which play a fundamental role in key aspects of 
cellular homeostasis such as membrane repair, autophagy, endocytosis and protein metabolism. The characterization of the 
genes and enzymes constituting the lysosome represents a central issue to be addressed toward a better understanding of 
the biology of this organelle. In humans, mutations that cause lysosomal enzyme deficiencies result in >50 different 
disorders and severe pathologies. So far, many experimental efforts using different methodologies have been carried 
out to identify lysosomal genes. The Human Lysosome Gene Database (hLGDB) is the first resource that provides a 
comprehensive and accessible census of the human genes belonging to the lysosomal system. This database was developed by 
collecting and annotating gene lists from many different sources. References to the studies that have identified each gene 
are provided together with cross-database gene-related information. Special attention has been given to the regulation of 
the genes through microRNAs and the transcription factor EB. The hLGDB can be easily queried to retrieve, combine and 
analyze information on different lists of lysosomal genes and their regulation by microRNA (binding sites predicted by five 
different algorithms). The hLGDB is an open-access, dynamic project that aims to consolidate, in a single 
publicly accessible resource, all the available biological information about lysosomal genes and their regulation. 

Database URL: http://lysosome.unipg.it/ 



Introduction 

Lysosomes are cellular organelles that play a pivotal role in 
cell homeostasis through their involvement in the degradation 
and recycling of extracellular material that 
has been internalized by endocytosis and of intracellular components 
that have been sequestered by autophagy (1). 
Lysosomes may also fuse with the plasma membrane, emp- 
tying their contents outside the cell. This is important for 
processes such as cellular immune response and plasma 
membrane repair, both in normal and pathological 
conditions (2, 3). Mutations that cause lysosomal enzyme 
deficiencies result in different syndromes, known as 
Lysosomal Storage Disorders (LSDs) (4). Most of the LSDs 
are associated with abnormal brain development and 
mental retardation. In addition, they are characterized by 
intracellular deposition and protein aggregation, events 
also found in age-related neurodegenerative disorders, 
such as Alzheimer's and Parkinson's diseases (5-7). These 
studies underline the importance of the lysosome as a central 
player in cell metabolism. Hence, the characterization 
of genes participating in lysosomal biogenesis and function 
is a critical step toward the understanding of basic pro- 
cesses in cell biology and pathogenic mechanisms in many 
human diseases. 

Recently, it was found that most lysosomal genes exhibit 
coordinated transcriptional behavior and are regulated by 
the transcription factor EB (TFEB), which also links autop- 
hagy to lysosome biogenesis (8, 9). Gene expression at the 
post-transcriptional level can be regulated by microRNAs 
(miRNAs). miRNAs play important roles in diverse biological 
processes, including development, cell differentiation, pro- 
liferation and apoptosis, in which the lysosomal system also 
plays an important role. Notably, miRNAs have been re- 
cently identified as involved in the regulation of autophagy 
(10, 11). 

The Human Lysosome Gene Database (hLGDB) is the first 
searchable database focused on the census of genes be- 
longing to the lysosomal system and on their regulation 
by miRNAs. No database resources entirely dedicated to 
the regulation of lysosomal genes by miRNAs or other regu- 
lators are currently available. Several lists of lysosomal 
genes were collected from public gene databases, pub- 
lished proteomics articles and reviews edited by biochem- 
ists and cell biologists working in the lysosome field. Many 
different algorithms are available for miRNA binding site 
prediction (12-14). Five are currently present in the database. 
We paid special attention to balancing predictors 
that are: (i) more suitable for seeking confirmatory 
evidence (TargetScanS) (15); (ii) more suitable for 
identifying any possible target of a particular miRNA, to form 
the basis for in vitro or in vivo experiments (picTar four-way 
and five-way) (16); and (iii) more suitable for finding in silico 
evidence for the interaction between a miRNA and a gene of a 
certain family or function (PITA, miRanda) (17). To extend the 
miRNA-target mRNA information, experimentally verified 
miRNA targets from miRTarBase are also reported. 

hLGDB aims to provide a useful resource for anyone 
studying lysosomes and a tool for identifying 
common regulatory features of lysosomal genes. hLGDB 
provides a user-friendly interface through which information 
can be easily retrieved, including the union and intersection 
of different gene lists, searches for miRNA 
predictions and visualization of the miRNA target predictions 
on the gene transcript sequence. 

Database Construction 

The data reported in the current version (1.1) were derived from 
NCBI PubMed searches for review articles regarding the human 
(8, 18-25) and murine (22, 26-29) proteome of the lysosome 
and from lists of lysosomal genes present in the 
Gene Ontology (30), KEGG (31), Reactome (32) and 
UniProt databases (33) [UniProt: 'Lysosome (KW-0458)' 
AND organism: 'Homo sapiens (Human) (9606)'; KEGG: 
'Lysosome (ko04142)'; GO: GO:0005764, date stamp from 
the source 20120303]. The references listed within each 
review article (34, 35) were examined, and lists of genes 
were extracted from either the full text or the supplementary 
information following manual curation. 

Figure 1. Home page of hLGDB with search options. 

Figure 2. Gene information using 'CTSF' as an example. (a) Summary of gene-centered information. (b) View of miRNA-binding sites 
on gene transcript. 



hLGDB currently contains 435 genes. There are 16 
sources of information divided into four main categories: 
Proteomic Studies, Databases, Reviews and Systems Biology 
Approaches. Each gene has been associated with its official 
HGNC Gene Symbol (36) and with its Entrez Gene ID (mappings 
were based on data provided by Entrez with a date 
stamp from the source of 7 March 2012). The transcripts 
associated with each gene are annotated according 
to NCBI RefSeq or GenBank (release 57). miRNA target predictions 
were extracted from the tables downloaded from 
the websites of the different algorithms used to predict the 
binding between miRNAs and gene transcripts. Coordinated 
Lysosomal Expression and Regulation (CLEAR) is a nucleotide 
motif (GTCACGTGAC) found to be highly enriched in 
the promoter set of lysosomal genes (8). We mapped this 
motif on both strands of the human genome (hg19) by 
means of the fuzznuc utility of the EMBOSS package, allowing 
a single mismatch (37). The TFEB binding sites 
come from a ChIP-seq experiment carried out on HeLa 
cells (38). 
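
As an illustration of this kind of motif mapping, the following Python sketch scans a sequence for the CLEAR motif on both strands while tolerating one mismatch. It is not the fuzznuc implementation, only a minimal approximation of the same idea; the function names (`reverse_complement`, `find_motif`) and the toy sequence are assumptions introduced for this example. 

```python
# Minimal sketch (not fuzznuc): find the CLEAR motif with up to one mismatch.
MOTIF = "GTCACGTGAC"  # CLEAR motif as given in the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(sequence, motif=MOTIF, max_mismatches=1):
    """Return (start, strand, mismatches) for every window matching the motif.

    `start` is the 0-based position on the forward strand. A "-" hit means
    the window matches the reverse complement of the motif.
    """
    sequence = sequence.upper()
    patterns = {"+": motif, "-": reverse_complement(motif)}
    width = len(motif)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        for strand, pattern in patterns.items():
            mismatches = sum(a != b for a, b in zip(window, pattern))
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return hits


# Hypothetical toy sequence containing one exact copy of the motif.
example_hits = find_motif("AAGTCACGTGACTT")
print(example_hits)
```

Because GTCACGTGAC is its own reverse complement, each match in this sketch is reported once per strand. 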



hLGDB is a MySQL 5.0.95 database (constructed in the 
fourth normal form, some redundancy being kept to in- 
crease retrieval performance), and the interface is built in 
PHP. 

Database Description and Utility 

Search for gene lists: hLGDB can be used to retrieve and 
combine lists of lysosomal genes from different sources. 
The ANY and ALL options allow the user to merge 
or intersect lists, respectively. Single genes can also be 
searched by gene symbol (Figure 1). 
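
The merge and intersect behavior can be pictured with set operations. The Python sketch below is only an illustration under that assumption; `combine_gene_lists`, the source names and the placeholder gene symbols (other than CTSF, used as the example gene in Figure 2) are hypothetical and do not reflect the actual hLGDB code or contents. 

```python
# Illustrative sketch: ANY = union of the selected lists, ALL = intersection.
def combine_gene_lists(gene_lists, mode="ANY"):
    """Combine gene lists given as a dict mapping source name -> gene symbols."""
    sets = [set(genes) for genes in gene_lists.values()]
    if not sets:
        return set()
    if mode == "ANY":
        return set.union(*sets)
    if mode == "ALL":
        return set.intersection(*sets)
    raise ValueError(f"mode must be 'ANY' or 'ALL', got {mode!r}")


# Hypothetical sources and placeholder symbols, for demonstration only.
selected = {
    "source_A": ["CTSF", "GENE_1", "GENE_2"],
    "source_B": ["CTSF", "GENE_2", "GENE_3"],
}
print(sorted(combine_gene_lists(selected, "ANY")))
print(sorted(combine_gene_lists(selected, "ALL")))
```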

Search for miRNA targets: once the user has selected or 
created a gene list, he/she can find targets of miRNAs (or 
miRNA families) by choosing different combinations of prediction 
tools. Results are returned in a table showing information 
about the gene (Gene Symbol and Gene Name) 
and the miRNA (identifier and number of tools predicting 
each binding). Gene Symbols are hyperlinked to a 
gene-centered page where additional gene information is 
provided (Figure 2a). On each transcript associated with 
the selected gene, miRNA binding sites are annotated 
(Figure 2b). 
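
The count of tools supporting each gene-miRNA pair can be sketched as a simple aggregation. The Python example below assumes predictions are available as (tool, gene symbol, miRNA identifier) records; this record layout, the function `count_supporting_tools` and the placeholder miRNA identifiers are assumptions made for illustration, not the hLGDB schema or real predictions. 

```python
# Illustrative sketch: count how many selected tools predict each gene-miRNA pair.
from collections import defaultdict


def count_supporting_tools(predictions, selected_tools, gene_list):
    """Return {(gene, mirna): number of selected tools predicting the pair}."""
    tools = set(selected_tools)
    genes = set(gene_list)
    support = defaultdict(set)
    for tool, gene, mirna in predictions:
        if tool in tools and gene in genes:
            support[(gene, mirna)].add(tool)  # a set avoids double counting
    return {pair: len(found) for pair, found in support.items()}


# Hypothetical records; the miRNA identifiers are placeholders.
records = [
    ("TargetScanS", "CTSF", "miR-A"),
    ("PITA", "CTSF", "miR-A"),
    ("miRanda", "CTSF", "miR-B"),
    ("PITA", "GENE_1", "miR-A"),
]
counts = count_supporting_tools(records, ["TargetScanS", "PITA"], ["CTSF"])
print(counts)
```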

Search for TFEB binding sites/CLEAR motifs: once the user 
has selected or created a list, he/she can choose the filter 
and find the lists of genes with TFEB binding sites or CLEAR 
motifs within a range around the transcription start site 
(TSS). The range is user-defined by specifying downstream 
and upstream boundaries around the TSS. Results are provided 
in a table showing gene-centered information, such as 
the hit count and the distance from the TSS of TFEB binding sites/ 
CLEAR motifs. 
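
A window filter of this kind can be sketched as follows. The text does not describe how hLGDB computes distances, so the strand-aware signed distance used here (negative values upstream of the TSS) is an assumption, as are the function name `sites_near_tss` and the example coordinates. 

```python
# Illustrative sketch: keep binding sites within a user-defined window around a TSS.
def sites_near_tss(sites, tss, strand, upstream, downstream):
    """Return (position, signed distance) for sites inside the window.

    Distances are measured in the direction of transcription, so negative
    values lie upstream of the TSS and positive values downstream.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    hits = []
    for position in sites:
        distance = position - tss if strand == "+" else tss - position
        if -upstream <= distance <= downstream:
            hits.append((position, distance))
    return hits


# Hypothetical genomic coordinates for a gene on the forward strand.
window_hits = sites_near_tss([900, 1500, 3000], tss=1000, strand="+",
                             upstream=200, downstream=600)
print(len(window_hits), window_hits)
```

The length of the returned list corresponds to the hit count reported per gene. 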

All data in hLGDB are freely available for download as 
tab-delimited text files without password protection for 
any user. Concerning other organisms, the database 
currently provides a parallel orthology annotation for mouse: 
the user can select the species of interest using the upper 
right button. 



Discussion and Future Direction 

hLGDB is a database that focuses on human lysosomal 
genes. It collects information about these genes and their 
regulation, such as TFEB binding sites and 
miRNAs. hLGDB was designed to become a lysosomal 
gene census. As new lysosomal genes are discovered, 
they will be added when the database is updated. 
Lysosomal genes include lysosomal hydrolases, lysosomal 
membrane proteins, lysosomal proteins involved in acidification 
and non-lysosomal proteins fundamental for the biogenesis 
of this organelle. Currently >50 recessively inherited 
diseases are associated with lysosomal gene dysfunction. 
In addition, there is increasing evidence that lysosomal 
genes play a role in the pathogenesis of common neurodegenerative 
diseases such as Alzheimer's, Parkinson's and 
Huntington's. 

Researchers may benefit from hLGDB because it gives them access, 
in a single resource, to the broadest compendium of 
lysosomal gene lists. They can search for miRNA targets 
by combining up to six different methods. Results for miRNA 
targets may be directly compared with other regulatory 
elements, such as the distance from the TSS of a 
TFEB binding site or of a CLEAR sequence, to 
identify common features of regulation. 

hLGDB has been designed to integrate additional layers 
of biological information, such as experimental data and 
comparative genomics. Currently the database presents information 
for human and mouse; in the next versions, 
additional species, such as rat, will be integrated. 
Finally, hLGDB provides a powerful resource for systems biology 
approaches and network analysis to dissect the map of 
interactions taking place in the lysosomal system. 